Regression with Linear Factored Functions
Many applications that use empirically estimated functions face a curse of dimensionality, because integrals over most function classes must be approximated by sampling. This paper introduces a novel regression algorithm that learns linear factored functions (LFF). This class of functions has structural properties that allow certain integrals to be solved analytically and point-wise products to be computed cheaply. Applications like belief propagation and reinforcement learning can exploit these properties to break the curse and speed up computation. We derive a regularized greedy optimization scheme that learns factored basis functions during training. The novel regression algorithm performs competitively with Gaussian processes on benchmark tasks, and the learned LFF functions are very compact, with 4-9 factored basis functions on average.
Comment: Under review as conference paper at ECML/PKDD 201
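To illustrate the structural property the abstract relies on, here is a minimal sketch of a linear factored function with monomial per-dimension factors. The basis choice is purely illustrative (not the one used in the paper); it shows how the integral over the unit hypercube factorizes into a product of cheap per-dimension integrals instead of requiring sampling:

```python
import numpy as np

# Linear factored function: f(x) = sum_i c_i * prod_d phi_{i,d}(x_d).
# Illustrative per-dimension factors: monomials x^k on [0, 1], so
#   int f dx = sum_i c_i * prod_d 1 / (k_{i,d} + 1)
# in closed form (the factorization, not this basis, is the point).

def lff_eval(x, coeffs, powers):
    """Evaluate f at points x (n, d), given coeffs (m,) and powers (m, d)."""
    # Product over dimensions of x_d ** k_{i,d} for each basis function i.
    basis = np.prod(x[:, None, :] ** powers[None, :, :], axis=2)  # (n, m)
    return basis @ coeffs

def lff_integral(coeffs, powers):
    """Analytic integral of f over the unit hypercube."""
    return coeffs @ np.prod(1.0 / (powers + 1.0), axis=1)
```

For example, with two basis functions over two dimensions, the analytic integral matches what Monte Carlo sampling would only approximate.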
Predicting Fluid Intelligence of Children using T1-weighted MR Images and a StackNet
In this work, we use T1-weighted MR images and a StackNet to predict fluid intelligence in adolescents. Our framework comprises feature extraction, feature normalization, feature denoising, feature selection, StackNet training, and fluid-intelligence prediction. The extracted features are the distributions of different brain tissues across brain parcellation regions. The proposed StackNet consists of three layers and 11 models; each layer uses the predictions from all previous layers, including the input layer. The proposed StackNet is evaluated on the public Adolescent Brain Cognitive Development Neurocognitive Prediction Challenge 2019 benchmark and achieves a mean squared error of 82.42 on the combined training and validation set with 10-fold cross-validation. It also achieves a mean squared error of 94.25 on the test data. The source code is available on GitHub.
Comment: 8 pages, 2 figures, 3 tables, Accepted by MICCAI ABCD-NP Challenge 2019; Added ND
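The layer-wise stacking described above can be sketched as follows. This is a hedged toy version using ridge base models, bootstrap resampling for diversity, and two layers; the actual challenge entry used three layers and 11 heterogeneous models, so treat this as an illustration of the wiring (each layer sees the input plus all previous predictions), not the authors' implementation:

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression weights."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

def stacknet_fit(X, y, n_layers=2, models_per_layer=3, lam=1.0, seed=0):
    rng = np.random.default_rng(seed)
    layers, feats = [], X
    for _ in range(n_layers):
        layer = []
        for _ in range(models_per_layer):
            # Bootstrap-resample rows so the models within a layer differ.
            idx = rng.integers(0, len(y), len(y))
            layer.append(ridge_fit(feats[idx], y[idx], lam))
        preds = np.column_stack([feats @ w for w in layer])
        layers.append(layer)
        feats = np.hstack([feats, preds])  # input + all previous predictions
    return layers

def stacknet_predict(X, layers):
    feats = X
    for layer in layers:
        preds = np.column_stack([feats @ w for w in layer])
        feats = np.hstack([feats, preds])
    return preds.mean(axis=1)  # average the final layer's models
```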
Predictive gene lists for breast cancer prognosis: A topographic visualisation study
<p>Abstract</p> <p>Background</p> <p>The controversy surrounding the non-uniqueness of predictive gene lists (PGLs), small selected subsets of genes drawn from the very large pool of candidates available in DNA microarray experiments, is now widely acknowledged <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Many of these studies have focused on constructing discriminative semi-parametric models and as such are also subject to the issue of random correlations arising from sparse model selection in high-dimensional spaces. In this work we outline a different approach based on an unsupervised, patient-specific, nonlinear topographic projection of predictive gene lists.</p> <p>Methods</p> <p>We construct nonlinear topographic projection maps based on inter-patient gene-list relative dissimilarities. The Neuroscale, Stochastic Neighbor Embedding (SNE) and Locally Linear Embedding (LLE) techniques are used to construct two-dimensional projective visualisation plots of the 70-dimensional PGLs per patient. Classifiers are then constructed to identify the prognosis indicator of each patient from the resulting projections, and we investigate whether the two prognosis groups are separable <it>a posteriori </it>on the evidence of the gene lists.</p> <p>A literature-proposed predictive gene list for breast cancer is benchmarked against a separate gene list using the above methods. Generalisation ability is investigated by using the mapping capability of Neuroscale to visualise the follow-up study, based on the projections derived from the original dataset.</p> <p>Results</p> <p>The results indicate that small subsets of patient-specific PGLs have insufficient prognostic dissimilarity to permit a distinction between the two prognosis groups. Uncertainty and diversity across multiple gene expressions prevent unambiguous, or even confident, patient grouping. Comparative projections across different PGLs provide similar results.</p> <p>Conclusion</p> <p>The random-correlation effect induced by selecting small subsets from very high-dimensional, interrelated gene expression profiles leads to outcomes with associated uncertainty. This continuum of uncertainty precludes any attempt at constructing discriminative classifiers.</p> <p>However, a patient's gene expression profile could possibly be used in treatment planning, based on knowledge of other patients' responses.</p> <p>We conclude that many of the patients involved in such medical studies are <it>intrinsically unclassifiable </it>on the basis of the provided PGL evidence. This additional category of 'unclassifiable' should be accommodated within medical decision support systems if serious errors and unnecessary adjuvant therapy are to be avoided.</p>
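The core idea of embedding patients from an inter-patient dissimilarity matrix can be illustrated with classical metric MDS. This is a simpler stand-in for Neuroscale, SNE and LLE (the paper's actual techniques), used here only to show how a 2-D topographic map is obtained from dissimilarities alone:

```python
import numpy as np

def classical_mds(D, n_components=2):
    """Embed points into n_components dimensions from a dissimilarity matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)        # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))
```

For Euclidean dissimilarities, classical MDS recovers the original inter-point distances exactly (up to rotation), which is the sense in which the 2-D map is "topographic".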
Fast empirical Bayesian LASSO for multiple quantitative trait locus mapping
<p>Abstract</p> <p>Background</p> <p>The Bayesian shrinkage technique has been applied to multiple quantitative trait loci (QTLs) mapping to estimate the genetic effects of QTLs on quantitative traits from a very large set of possible effects, including the main and epistatic effects of QTLs. Although the recently developed empirical Bayes (EB) method significantly reduced computation compared with the fully Bayesian approach, its speed and accuracy are limited by the fact that numerical optimization is required to estimate the variance components in the QTL model.</p> <p>Results</p> <p>We developed a fast empirical Bayesian LASSO (EBLASSO) method for multiple QTL mapping. The fact that the EBLASSO estimates the variance components in closed form, along with other algorithmic techniques, renders the EBLASSO method more efficient and accurate. Compared with the EB method, our simulation study demonstrated that the EBLASSO method could substantially improve computational speed and detect more QTL effects without increasing the false positive rate. In particular, the EBLASSO algorithm running on a personal computer could easily handle a linear QTL model with more than 100,000 variables in our simulation study. Real data analysis also demonstrated that the EBLASSO method detected more reasonable effects than the EB method. Compared with the LASSO, our simulation showed that the current Matlab implementation of the EBLASSO ran at a speed similar to that of the LASSO implemented in Fortran, and that the EBLASSO detected the same number of true effects as the LASSO but far fewer false positive effects.</p> <p>Conclusions</p> <p>The EBLASSO method can handle a large number of effects, possibly including the main and epistatic QTL effects, environmental effects, and the effects of gene-environment interactions. It will be a very useful tool for multiple QTL mapping.</p>
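The LASSO baseline that the abstract compares against can be sketched with cyclic coordinate descent. This is a generic textbook implementation of sparse effect selection (as in QTL models with many candidate effects), not the EBLASSO's closed-form variance-component updates:

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of the L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Plain LASSO (0.5*||y - X b||^2 + lam*||b||_1) by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    r = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * beta[j]              # remove j's current contribution
            z = X[:, j] @ r                     # partial correlation with residual
            beta[j] = soft_threshold(z, lam) / col_sq[j]
            r -= X[:, j] * beta[j]
    return beta
```

On a sparse simulated model, most coefficients are driven exactly to zero, which is the behaviour that makes LASSO-type methods attractive when screening 100,000+ candidate effects.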
Î-stochastic neighbour embedding for feed-forward data visualization
t-distributed Stochastic Neighbour Embedding (t-SNE) is one of the most popular nonlinear dimension reduction techniques, used in multiple application domains. In this paper we propose a variation on the embedding neighbourhood distribution, resulting in Î-SNE, which can construct a feed-forward mapping using an RBF network. We compare the visualizations generated by Î-SNE with those of t-SNE and provide empirical evidence suggesting that the network is capable of robust interpolation and automatic weight regularization.
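The feed-forward mapping idea, training an RBF network to reproduce a given 2-D embedding so that new points can be projected without re-running the embedding, can be sketched as a regularized least-squares fit. The widths, centres and regularization below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def rbf_design(X, centres, width):
    """Gaussian RBF design matrix between points X and the network centres."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * width ** 2))

def fit_feedforward_map(X, Y, centres, width, reg=1e-3):
    """Least-squares RBF map from data space to a given 2-D embedding Y."""
    Phi = rbf_design(X, centres, width)
    # Ridge-regularized normal equations for the output weights.
    W = np.linalg.solve(Phi.T @ Phi + reg * np.eye(Phi.shape[1]), Phi.T @ Y)
    return W
```

New points are then embedded with `rbf_design(X_new, centres, width) @ W`, giving the interpolation behaviour the abstract refers to.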
Direct estimation of wall shear stress from aneurysmal morphology: A statistical approach
Computational fluid dynamics (CFD) is a valuable tool for studying vascular diseases, but it requires long computation times. To alleviate this issue, we propose a statistical framework to predict aneurysmal wall shear stress patterns directly from the aneurysm shape. A database of 38 complex intracranial aneurysm shapes is used to generate aneurysm morphologies and CFD simulations. The shapes and wall shear stresses are then converted to clouds of hybrid points containing both types of information. These are subsequently used to train a joint statistical model implementing a mixture of principal component analyzers. Given a new aneurysmal shape, the trained joint model is first collapsed to a shape-only model and used to initialize the missing shear stress values. The estimated hybrid point set is further refined by projection into the joint model space. We demonstrate that our predicted patterns achieve significant similarity to the CFD-based results.
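The collapse-and-predict step can be illustrated with a plain joint PCA over concatenated shape and stress features: fit principal axes on the joint vectors, then, for a new shape, solve for latent coordinates using only the shape block and read off the stress block. This is a single-analyzer sketch with synthetic feature vectors, not the paper's mixture of principal component analyzers on point clouds:

```python
import numpy as np

def fit_joint_pca(S, W, n_components):
    """PCA of concatenated [shape | stress] training vectors."""
    Z = np.hstack([S, W])
    mu = Z.mean(axis=0)
    U, sv, Vt = np.linalg.svd(Z - mu, full_matrices=False)
    return mu, Vt[:n_components]          # mean and principal axes (k, ds+dw)

def predict_stress(s_new, mu, V, n_shape):
    """Infer the stress block from the shape block via least squares in latent space."""
    Vs, Vw = V[:, :n_shape], V[:, n_shape:]   # split axes into shape/stress blocks
    c, *_ = np.linalg.lstsq(Vs.T, s_new - mu[:n_shape], rcond=None)
    return mu[n_shape:] + Vw.T @ c
```

When shape and stress truly share a low-dimensional latent structure, the stress block is recovered from the shape block alone.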
ABCD Neurocognitive Prediction Challenge 2019: Predicting individual fluid intelligence scores from structural MRI using probabilistic segmentation and kernel ridge regression
We applied several regression and deep learning methods to predict fluid intelligence scores from T1-weighted MRI scans as part of the ABCD Neurocognitive Prediction Challenge (ABCD-NP-Challenge) 2019. We used voxel intensities and probabilistic tissue-type labels derived from them as features to train the models. The best predictive performance (lowest mean-squared error) came from kernel ridge regression (KRR), which produced a mean-squared error of 69.7204 on the validation set and 92.1298 on the test set. This placed our group fifth on the validation leaderboard and first on the final (test) leaderboard.
Comment: Winning entry in the ABCD Neurocognitive Prediction Challenge at MICCAI 2019. 7 pages plus references, 3 figures, 1 tabl
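Kernel ridge regression is compact enough to sketch directly. The RBF kernel and hyperparameters below are illustrative choices, not the challenge entry's configuration or features:

```python
import numpy as np

def krr_fit(X, y, lam=1.0, gamma=0.1):
    """Fit kernel ridge regression with an RBF kernel; returns dual weights."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-gamma * d2)
    # Solve (K + lam*I) alpha = y; predictions are kernel-weighted sums.
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y)
    return alpha

def krr_predict(X_new, X, alpha, gamma=0.1):
    d2 = ((X_new[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2) @ alpha
```

The regularizer `lam` trades training fit against smoothness, which is the property that makes KRR competitive on small-sample, high-dimensional neuroimaging features.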
Radiocarbon dating of methane and carbon dioxide evaded from a temperate peatland stream
Streams draining peatlands export large quantities of carbon in different chemical forms and are an important part of the carbon cycle. Radiocarbon (14C) analysis provides unique information on the sources of carbon and the rates at which it is cycled through ecosystems, as has recently been demonstrated at the air-water interface through analysis of carbon dioxide (CO2) lost from peatland streams by evasion (degassing). Peatland streams also have the potential to release large amounts of methane (CH4) and, though 14C analysis of CH4 emitted by ebullition (bubbling) has been reported previously, diffusive emissions have not. We describe methods that enable 14C analysis of CH4 evaded from peatland streams. Using these methods, we investigated the 14C age and stable carbon isotope composition of both CH4 and CO2 evaded from a small peatland stream draining a temperate raised mire. Methane was aged between 1617 and 1987 years BP and was much older than the CO2, which had an age range of 303-521 years BP. Isotope mass balance modelling of the results indicated that the CO2 and CH4 evaded from the stream were derived from different source areas, with most evaded CO2 originating from younger layers nearer the peat surface than the CH4 sources. The study demonstrates the insight into peatland carbon cycling that can be gained from a methodological development enabling dual isotope (14C and 13C) analysis of both CH4 and CO2 collected at the same time and in the same way.
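The isotope mass balance mentioned above reduces, in its simplest form, to a two-end-member mixing calculation: the fraction of a sample attributable to one source follows from the sample's isotope signature and the two end-member signatures. The sketch and the values in the test are hypothetical illustrations, not the study's measurements:

```python
# Two-end-member isotope mass balance:
#   delta_sample = f * delta_a + (1 - f) * delta_b
# solved for f, the fraction of the sample derived from end-member A.
# End-member signatures used anywhere with this function are assumptions.

def mixing_fraction(delta_sample, delta_a, delta_b):
    """Fraction of the sample derived from end-member A (0 <= f <= 1 if delta_sample lies between the end-members)."""
    return (delta_sample - delta_b) / (delta_a - delta_b)
```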
Effect of fulvic acids on lead-induced oxidative stress to metal sensitive Vicia faba L. plant
Lead (Pb) is a ubiquitous environmental pollutant capable of disrupting various morphological, physiological, and biochemical functions in plants. Only a few publications focus on the influence of Pb speciation on both its phytoavailability and its phytotoxicity. Therefore, Pb toxicity (in terms of lipid peroxidation, hydrogen peroxide induction, and photosynthetic pigment contents) was studied in Vicia faba plants in relation to Pb uptake and speciation. V. faba seedlings were exposed to Pb supplied as Pb(NO3)2 or complexed by two fulvic acids (FAs), i.e. Suwannee River fulvic acid (SRFA) and Elliott Soil fulvic acid (ESFA), for 1, 12, and 24 h under controlled hydroponic conditions. For both FAs, Pb uptake and translocation by Vicia faba increased at a low application level (5 mg l⁻¹) but decreased at a high level (25 mg l⁻¹). Despite the increased Pb uptake with FAs at low concentrations, there was no influence on Pb toxicity to the plants. However, at high concentrations, FAs reduced Pb toxicity by reducing its uptake. These results highlighted the role of the dilution factor for FA reactivity in relation to structure; SRFA was more effective than ESFA in reducing Pb uptake and alleviating Pb toxicity to V. faba, owing to its comparatively strong binding affinity for the heavy metal.
R-Gada: a fast and flexible pipeline for copy number analysis in association studies
<p>Abstract</p> <p>Background</p> <p>Genome-wide association studies (GWAS) using Copy Number Variation (CNV) are becoming a central focus of genetic research. CNVs have successfully provided target genome regions for some disease conditions where simple genetic variation (i.e., SNPs) has previously failed to provide a clear association.</p> <p>Results</p> <p>Here we present a new R package that integrates: (i) data import from the most common formats of Affymetrix, Illumina and aCGH arrays; (ii) a fast and accurate segmentation algorithm to call CNVs based on Genome Alteration Detection Analysis (GADA); and (iii) functions for displaying and exporting the copy number calls, identification of recurrent CNVs, multivariate analysis of population structure, and tools for performing association studies. Using a large dataset containing 270 HapMap individuals (Affymetrix Human SNP Array 6.0 Sample Dataset), we demonstrate a flexible pipeline implemented with the package. It requires less than one minute per sample (arrays of 3 million probes) on a single-core computer and provides flexible parallelization for very large datasets. Case-control data were generated from the HapMap dataset to demonstrate a GWAS analysis.</p> <p>Conclusions</p> <p>The package provides the tools for creating a complete integrated pipeline from data normalization to statistical association. It can efficiently handle a massive volume of data consisting of millions of genetic markers and hundreds or thousands of samples with very accurate results.</p>
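The segmentation step at the heart of such CNV-calling pipelines can be illustrated with simple recursive binary segmentation on a 1-D log-ratio signal: repeatedly place the single best mean-change point until no split exceeds a score threshold. This is a generic sketch, not the GADA algorithm (which uses sparse Bayesian learning with backward elimination):

```python
import numpy as np

def best_split(x):
    """Index and t-like score of the best single mean change point in x."""
    n = len(x)
    best_i, best_score = None, 0.0
    csum = np.cumsum(x)
    total = csum[-1]
    for i in range(2, n - 2):
        m1, m2 = csum[i] / i, (total - csum[i]) / (n - i)
        # Mean difference scaled by segment sizes (t-statistic up to sigma).
        score = abs(m1 - m2) * np.sqrt(i * (n - i) / n)
        if score > best_score:
            best_i, best_score = i, score
    return best_i, best_score

def segment(x, threshold=3.0):
    """Recursive binary segmentation; returns sorted breakpoint indices."""
    i, score = best_split(x)
    if i is None or score < threshold:
        return []
    left = segment(x[:i], threshold)
    right = [i + j for j in segment(x[i:], threshold)]
    return sorted(left + [i] + right)
```

On a noisy signal with one amplified region, the two recovered breakpoints bracket the copy-number change.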